IDENTIFICATION OF AND CLONING 
A MOBILE TRANSPOSON FROM ASPERGILLUS 

Field of the Invention 

The present invention is directed at the identification, cloning and sequencing of 
mobile transposons or transposable elements from Aspergillus niger var. awamori. The 
transposable elements, referred to as Vader and Tan1, are approximately 437 base pair 
(bp) and 2.3 kb elements, respectively. The Vader and Tan1 elements are bounded by 
inverted repeat sequences of 44 and 45 base pairs, respectively. The transposable 
elements target a "TA" sequence in target DNA during insertion. In addition, the present 
invention is directed at the identification, cloning and sequencing of one or more 
transposable element(s) from other filamentous fungi using as a probe DNA comprising the 
Vader element 44 bp or the Tan1 element 45 bp inverted repeat isolated from Aspergillus 
niger var. awamori. Also provided are methods for utilizing either the Vader or Tan1 
elements to inactivate genes (for example, by inserting the transposon into the gene to be 
inactivated), to overexpress a gene (by, for example, inserting a known promoter or other 
regulatory gene within the inverted repeats of Vader or Tan1 and allowing the DNA of the 
IR-promoter-IR to jump in front of (and overexpress) a gene of interest) or to act as an 
activation marker to, for example, identify new promoters. 

Background of the Invention 

It is well known that transposons are a class of DNA sequences that can move from 
an episome to a chromosomal site or from one chromosomal site to another. Transposons 
are known in both prokaryotes, such as bacteria, as well as in eukaryotes, although there 
have been few transposons isolated from filamentous fungi. 

Several groups have looked for transposons in filamentous fungi. The element 
pogo, which exists in multiple copies and at different sites in different strains of Neurospora 
crassa, was described by Schectman (1) and is believed to be a transposon. To date the 
most characterized transposon in filamentous fungi is Tad. Tad was isolated as a 
spontaneous mutant in the am (glutamate dehydrogenase) gene in an Adiopodoume strain 
of N. crassa isolated from the Ivory Coast. To detect mutations caused by insertion of a 
transposable element, Kinsey and Helber (2) isolated genomic DNA from 33 am mutant 
strains which were then screened by Southern analysis for restriction fragment size 
alterations. In two of the mutant strains, the mutation was shown to be caused by the 
insertion of a 7 kb element (Tad) into the am gene. Subsequently Kinsey (3) demonstrated 
that Tad was able to transpose between nuclei of heterokaryons, confirming that Tad was a 
retrotransposon and that there was a cytoplasmic phase involved in the retrotransposition 
events. More recently, Cambareri et al. (4) demonstrated that Tad was a LINE-like DNA 
element with two major open reading frames (ORFs) on the plus strand. Typical of LINE- 
like elements, Tad had no terminal repeats. Attempts to isolate mobile transposons in 
laboratory strains of N. crassa were unsuccessful. 

A second retrotransposon was cloned by McHale et al. (5), who reported the 
isolation of CfT-1, an LTR-retrotransposon from Cladosporium fulvum. This transposon 
was 6968 bp in length and bounded by identical long terminal repeats of 427 bp, with a 5 bp 
target site duplication. Virus-like particles were detected which co-sediment with reverse 
transcriptase activity in homogenates of this fungus. 

Daboussi et al. (6) were the first to successfully use the niaD (nitrate reductase) 
gene as a transposon trap. The niaD mutants can be isolated by a direct selection for 
chlorate resistance (7) . The strategy employed was to isolate niaD mutants amongst six 
isolates belonging to different races of the fungus Fusarium oxysporum. More than 100 
niaD mutants were isolated from each isolate and examined for instability. One strain, F24, 
yielded up to 10% unstable niaD mutants. Assuming that the genetic instability of the niaD 
mutants was caused by transposable elements, it seemed plausible that this isolate 
contained mobile transposons. A stable niaD mutant in the F24 was transformed with the 
cloned niaD gene from A. nidulans because the F. oxysporum niaD gene had not been 
cloned. Unstable niaD mutants were isolated in transformants containing the A. nidulans 
niaD gene. Two unstable niaD mutants were shown by Southern blot analysis to contain an 
insertion of 1.9 kb in size. Analysis of this element, Fot1, revealed it was 1928 bp long, had 
44 bp inverted terminal repeats, contained a large open reading frame, and was flanked 
by a 2 bp (TA) target site duplication. Very recently, Daboussi et al. (8) have reported the 
cloning of a new transposable element from an unstable niaD mutant. This element, FML 
(Fusarium mariner-like), is 1280 bp long and has inverted repeats of 27 bp. The FML 
element inserts into a TA site and excises imprecisely. 

Using the strategy of characterizing unstable niaD mutants, Lebrun et al. (9) were 
able to isolate a transposon from Magnaporthe grisea. However, in this case the A. 
nidulans niaD gene, which had been introduced into M. grisea by transformation, was used as a 
transposon trap. The element inserted into the niaD gene was shown to belong to a family 
of M. grisea LTR-retrotransposons, Fos1 (Schull and Hamer, unpublished) and Magi 
(Farman and Leong, unpublished). The cloned retro-element was 5.6 kb and the target site 
(ATATT) was shown to be duplicated. All revertants from this mutant examined had one 
copy of the LTR left at the point of insertion. A second transposon, Pot2, from M. grisea 
was recently cloned by Kachroo et al. (10). The strategy used to clone Pot2 was to analyze 
the fingerprint patterns of repetitive DNAs which were cloned from the M. grisea genome. 
A repetitive family present in both rice and non-rice pathogens of M. grisea in high copy 
number was cloned. The element, 1857 bp in size, has 43 bp perfect terminal inverted 
repeats (TIRs) and 16 bp direct repeats within the TIRs. An open reading frame was shown 
to display extensive identity to that of Fot1 of F. oxysporum. As with Fot1, the Pot2 
element duplicates the dinucleotide TA at the target insertion site. Pot2 was shown to be 
present at a copy number of approximately 100 per haploid genome. 

Several groups have reported looking without success for transposons in laboratory 
strains of A. nidulans (Kinghorn, personal communication, 5). One explanation for the 
lack of transposons in laboratory strains is that the desirable features of strain stability 
required for genetic analysis may preclude strains with mobile transposons. By using the 
niaD gene as a transposon trap we have identified and isolated a transposable element 
from the industrially important fungus A. niger var. awamori. This element, Vader, is 
present in approximately 15 copies in A. niger and A. niger var. awamori. Southern 
analysis of A. nidulans with this element indicates that this transposable element was 
absent from one laboratory strain and only present as a single copy in a second laboratory 
strain. These results support the notion that laboratory strains of A. nidulans contain very 
few transposons. 

Brief Description of the Invention 

In accordance with the present invention, novel eukaryotic transposable elements 
from Aspergillus niger var. awamori are provided. The larger transposable element, 
referred to herein as Tan1, is 2.3 kb in size. The smaller transposable element, referred to 
herein as Vader, is a 437 bp element (SEQ ID NO:3). Vader is found within the larger 
element Tan1 and comprises a 44 bp inverted repeat sequence at either end. Tan1 is 
an element of approximately 2325 bp which comprises 45 bp inverted repeats at either 
terminus and internal IRs. Tan1 comprises a 555 aa open reading frame (ORF) which 
codes for a transposase which allows the elements (Tan1 or Vader) to "hop" or insert 
themselves in the genome of a host. The target for insertion of these novel transposable 
elements is a "TA" sequence in the target DNA. The "TA" sequence is 
repeated at either end of the transposon upon insertion of the transposable element into 
the target DNA. Therefore, the present invention provides the larger Tan1 transposable 
element as well as the smaller element (Vader) internal thereto, as well as the DNA 
encoding each. 
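
The TA target site duplication described above can be illustrated with a minimal sketch. The function names and the toy sequences below are hypothetical and are not part of the disclosure; the sketch only shows how a single TA in the target ends up flanking both ends of an inserted element.

```python
def ta_sites(target: str) -> list[int]:
    """Return the start index of every TA dinucleotide in the target DNA."""
    return [i for i in range(len(target) - 1) if target[i:i + 2] == "TA"]


def insert_element(target: str, site: int, element: str) -> str:
    """Insert `element` at the TA starting at `site`.

    The original TA is duplicated, so that one copy of TA flanks
    each end of the inserted element.
    """
    if target[site:site + 2] != "TA":
        raise ValueError("insertion site must be a TA dinucleotide")
    return target[:site] + "TA" + element + "TA" + target[site + 2:]


# Toy target and toy element (both invented for illustration only).
toy_target = "GGCTAGCC"
toy_element = "NNNN"
sites = ta_sites(toy_target)
inserted = insert_element(toy_target, sites[0], toy_element)
```

After insertion the product is longer than the target by the length of the element plus 2 bp, the 2 bp being the duplicated TA.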

Another embodiment of the present invention comprises a fragment of the Vader or 
Tan1 transposable elements which comprises the 44 or 45 bp (respectively) inverted repeat 
sequences found at either terminus of the transposable element from A. niger var. 
awamori, as well as the use of said fragments as probes to hybridize under low stringency 
conditions to DNA of other filamentous fungi for the isolating and/or cloning of transposable 
elements from such other filamentous fungi. While the exact 44 bp IR of Vader or the 45 
bp IR of Tan1 can be utilized, it is well understood by those skilled in the art that variations of 
such DNA would also work as suitable probes. For example, at a minimum, the imperfect 
direct repeats within the IRs of Tan1 would be suitable to use as probes for isolating 
transposable elements from other filamentous fungi. Initially the inverted repeat of Vader 
was used to clone Tan1 using PCR techniques. This work was followed by obtaining a 
genomic copy of Tan1 from a partial library. 

Another embodiment of the present invention is the transposase activity coded for 
by the ORF of Tan1. This transposase is 555 aa (SEQ ID NOS:7 or 14, PCR and genomic, 
respectively). 

In a process embodiment of the present invention there are provided methods for 
gene tagging comprising using the transposable elements of the present invention (Vader 
or Tan1 or any transposable element isolated using the IRs of either) to inactivate genes 
via insertion of the element into a given gene, thus disrupting or inactivating gene 
expression. Alternatively, the transposable element can be used in activation tagging (to 
activate or turn on genes) rather than for gene disruption. For example, by inserting DNA 
encoding a promoter into the transposable element and then allowing such transposable 
element to become inserted 5' to a desired gene, the promoter may be activated to drive 
the expression of the desired gene product or to turn on cryptic pathways. Additionally, 
gene tagging can be utilized to activate marker genes by inserting a marker gene within the 
IRs of a transposon of the present invention. This marker gene can then "hop" into 
targeted DNA and, if expression of the marker is selected for, it will be possible to identify 
This may lead to identification or isolation of new 



Brief Description of the Drawings 

Fig. 1 shows the Southern blot analysis of unstable niaD mutants. PCR-amplified 
genomic niaD gene from four niaD mutants and UVK143f were digested with BglII (sites are 
3' of all inserts). The blot was probed with a 500 bp fragment of the SalI-digested PCR product of niaD1 
and niaD2. The wild-type band hybridizes at 2.5 kb while the gene with an insertion hybridizes at 2.9 
kb. Lanes: 1=MW marker III (Boehringer Mannheim); 2=UVK143f; 3=niaD410; 4=niaD436; 
5=niaD587; 6=niaD392. 

Fig. 2 depicts the mapping of Vader insertions within the niaD gene. The positions 
of Vader insertions 1-4 (niaD410, niaD436, niaD587 and niaD392, respectively) are shown 
relative to the six introns of the structural gene coding region. Because the exact site of 
insertion for Vader-1 and Vader-4 is still unknown, they have been presented using the 
approximate area of insertion. Relevant restriction sites are shown using the following 
letters: E=EcoRI, S=SalI, Sp=SphI, K=KpnI, and B=BglII. 

Fig. 3 shows Southern blot analysis to determine Vader genomic copy number. 
Four A. niger var. awamori niaD mutants and UVK143f were digested with EcoRV to 
completion. EcoRV cuts the Vader sequence once. Hybridization indicates that Vader is 
present in the genome in more than 14 copies. The hybridizing bands of niaD392, which 
are different from the other mutants and UVK143f, suggest that the Vader sequence is 
mobile. Lanes: 1=MW marker III, 2=UVK143f, 3=niaD410, 4=niaD436, 5=niaD587, 
6=niaD392. 

Fig. 4. Southern blot to determine presence of Vader sequence in other fungi. 
Other filamentous fungi, an industrial production strain and niaD mutant 392 were digested 
with EcoRV to completion. Low stringency hybridization (32) indicates that sequences 
homologous to Vader are present in A. nidulans (FGSC A237), A. cinnamomeus, A. 
phoenicis, A. foetidus and an industrial A. niger strain. Lanes: 1=MW marker, 2=A. foetidus, 
3=an industrial glucoamylase production strain of A. niger (ETC #2663), 4=A. niger var. 
awamori niaD mutant 392, 5=A. phoenicis (ATCC #11362), 6=A. nidulans (FGSC A691), 
7=A. wentii (ATCC #10593), 8=A. versicolor, 9=A. cinnamomeus (ATCC #1027), 10=A. 
nidulans (FGSC A237). 

Fig. 5. Southern blot to determine Tan1 (transposon from A. niger) genomic copy 
number. Four A. niger var. awamori niaD mutants and UVK143f were digested 
with EcoRI to completion. EcoRI cuts the Tan1 sequence once. A probe corresponding to 
the ORF region (see Fig. 9) was used in the hybridization. Hybridization indicates that 
Tan1 is present as a single copy in the genome. Lanes: 1=MW marker III, 2=UVK143f, 
3=niaD410, 4=niaD436, 5=niaD587, 6=niaD392. 

Figs. 6A-6C. Southern blots to determine if the inverted repeats of transposable 
elements Fot1 and Pot2 will hybridize to elements in A. niger var. awamori. Four A. niger 
var. awamori niaD mutants were digested with EcoRI to completion. EcoRI cuts 
the Tan1 sequence once. Inverted repeat oligonucleotide probes of Vader (SEQ ID NO:5), 
Fot1 and Pot2 were labeled with digoxigenin (Boehringer Mannheim). Lanes: 1=MW 
marker III, 2=niaD436, 3=niaD587. Blots A (lanes 1-3), B and C were probed with the 
labeled inverted repeat probes of Vader, Fot1 and Pot2, respectively. 

Fig. 7 shows the sequence of the Vader insertion (SEQ ID NO:3) as generated by 
PCR. Vader was found to be 437 bp in length. The 44 bp inverted repeat of the Vader 
insert corresponding to SEQ ID NO:4 (the 5' IR) and SEQ ID NO:5 (the 3' IR), respectively, 
from the 5' end to the 3' end of Vader are underlined, the single mismatch which occurs in 
the inverted repeats is identified in bold, and the TA 2 bp duplication is shown in bold print. 
niaD sequences flanking the element are shown in lower case letters. 

Figs. 8A and 8D show the entire DNA sequence of the Tan1 element (SEQ ID 
NO:6) as generated by PCR, as well as the putative amino acid sequence of the 
transposase coded for by Tan1 (SEQ ID NO:7). Tan1 as generated by PCR is 2320 bp in 
length (excluding the unknown nucleotides shown as "N" in the figure) and has a large 
open reading frame of 1668 bp which encodes for 555 amino acids (SEQ ID NO:7). Tan1 
comprises the sequences of four inverted repeats (underlined) similar to those found in 
Vader. 

Fig. 9 shows a schematic presentation of Vader and Tan1 elements. Dark boxes 
represent the 45 bp (Tan1) and 44 bp (Vader) inverted repeats. The unique EcoRI site in 
the Tan1 element was used for digestion of genomic DNA in Southern analysis (Figs. 5 and 
10). Bold, horizontal lines above the Tan1 element indicate the probes corresponding to 
the end of the ORF and Vader used in Southern analysis shown in Fig. 10 and Fig. 5. 

Fig. 10 shows Southern analysis of A. niger var. awamori niaD mutants (niaD410, 
niaD436, niaD587, niaD392) and the wild-type UVK143f: lane 1, molecular weight marker 
III (Boehringer Mannheim); lane 2, UVK143f; lane 3, niaD410; lane 4, niaD436; lane 5, 
niaD587; lane 6, niaD392. This blot was probed for the Vader element (see Fig. 9). When 
this blot (Fig. 10) was superimposed with the blot shown in Fig. 5, one of the illuminated 
bands from the Vader-probe hybridization overlaid the single band in the ORF-probe 
hybridization, indicating that the Tan1 element is composed of contiguous ORF and Vader 
elements. 

Figs. 11A and 11D show the nucleotide sequence (genomic copy) of Tan1 (SEQ ID 
NO: 13). The amino acid sequence of the putative transposase (555 aa) (SEQ ID 
NO: 14) is shown below the DNA sequence in the one-letter amino acid code. The inverted 
repeats are underlined (SEQ ID NOS:1, 2, 15 and 16, respectively, 5' to 3') and the 
imperfect direct repeats within the inverted repeats are shown with arrows above or below 
the sequence. The gaps within the arrows indicate the imperfect nucleotides within the 
direct repeats. Undetermined sequence is denoted in the figure by question marks and in 
the sequence listing as "N." The figure shows the DNA sequence as 2324 base pairs, 
excluding the unknown nucleotides indicated by "?" in the figure. 

Detailed Description of the Invention 

While the specification concludes with claims particularly pointing out and distinctly 
claiming the subject matter regarded as forming the present invention, it is believed that the 
invention will be better understood from the following detailed description of preferred 
embodiments. 

Standard biochemical nomenclature is used herein in which the nucleotide bases 
are designated as adenine (A); thymine (T); guanine (G) ; and cytosine (C). N connotes 
any of these nucleotides. As is conventional for convenience in the structural 
representation of a DNA nucleotide sequence, only one strand is usually shown in which A 
on one strand connotes T on its complement and G connotes C. 

Applicants have isolated two transposable elements from A. niger var. awamori. The 
cloned element Vader was identified by screening unstable nitrate reductase (niaD) 
mutants for insertion. This element is present in approximately fifteen copies in the 
genome of A. niger strains examined. In contrast, the Vader element is present in one 
copy in only one of the two A. nidulans strains studied. These results explain why several 
groups have been unsuccessful in isolating active transposons in laboratory A. nidulans 
strains. A plausible assumption is that "domesticated" strains of A. nidulans have lost their 
transposons due to repeated manipulation of such strains and the possible discarding of 
aberrant A. nidulans strains displaying genetic instability. 

The Vader element shows similarities to the transposable elements Pot2 from M. grisea (12) 
and Fot1 from F. oxysporum (8), both cloned from plant pathogens. The target site 
for duplication in all three fungi is a 2 bp TA sequence. In the case of Fot1, this transposon 
does not excise precisely. In two niaD revertants examined, the excision products retained 
a 4 bp insertion relative to the wild-type gene (TAATTA versus TA). The insertion studied 
was integrated into an intron; therefore, imprecise excision of Fot1 did not affect the 
functionality of the niaD gene product. There is no published evidence that Pot2 is a 
functional element. 
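
The significance of the intron location can be made explicit with a short worked example. It is offered only as an illustration; the function name is hypothetical. A footprint whose length is not a multiple of three would shift the reading frame if left within coding sequence, whereas within an intron it can be tolerated.

```python
def footprint(wild_type_site: str, excision_product: str) -> tuple[int, bool]:
    """Return (bases added by imprecise excision, would it shift a reading frame?)."""
    added = len(excision_product) - len(wild_type_site)
    return added, added % 3 != 0


# Fot1 excision as reported above: TAATTA is left where the wild type has TA.
added, shifts_frame = footprint("TA", "TAATTA")
```

Here `added` is 4 and `shifts_frame` is true, which is consistent with the revertants remaining functional only because the footprint lies in an intron.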

A homology search made at the nucleotide level gave 60.7% homology 
between Tan1 and the A. oryzae agdA gene, which codes for an α-glucosidase (33), 
over a 1230 bp overlap. This homology search revealed that the last 1.2 kb of a total of 5.2 kb of 
the α-glucosidase sequence submitted to GenBank is, in fact, part of a novel transposon, 
hereinafter called Tao1 (transposon Aspergillus oryzae), which also belongs to the Fot1 
family. Only the 5' half of the Tao1 element is included in the GenBank sequence; thus, for 
lack of a comparison, the exact size of the inverted repeat cannot be determined. 
However, it can be concluded that there are 13 bp perfect direct repeats within the inverted 
repeat. The inverted repeat is flanked by a TA dinucleotide, suggesting a commonly 
occurring TA insertion site. Direct analyses gave only short ORFs, but when the frequently 
occurring stop codons were ignored, a long ORF was obtained which shared over 50% 
identity with the Tan1 transposase. Multiple stop codons indicate that the A. oryzae Tao1 is a 
defective element. This transposable element from A. oryzae, thus, is within the scope of 
the present invention as, based on the high degree of sequence homology between 
Tan1 and Tao1, it is believed that Tao1 would hybridize to a probe comprising Tan1 or 
Vader IRs or variations thereof. The sequence of the IR of Tao1 is provided as SEQ ID 
NO:17. This IR (Tao1) or the IRs from Tan1 or Vader may be used to isolate other 
transposable elements from filamentous fungi. 

In an attempt to determine if there were transposons similar to those reported for F. 
oxysporum and M. grisea, synthetic oligomers were made corresponding to the inverted 
repeats of both Fot1 (7) and Pot2 (10). When Southern analysis of A. niger var. awamori 
was conducted using the Vader 44 bp inverted repeat (SEQ ID NO:5) as a control, no 
conclusive hybridizations could be detected with either the Fot1 or Pot2 oligomeric probe. 
These results indicate that elements with high identity to F. oxysporum Fot1 and M. grisea 
Pot2 are not found in the A. niger var. awamori genome. 

With regard to the structure of the Vader element, elements which transpose 
directly through DNA copies are typified by having inverted terminal repeats. Elements 
which transpose through reinsertion of the product of reverse transcription of an RNA copy 
of the element (retroelements) can be without long terminal repeats, such as the Drosophila I 
element (for a review see (16)). Alternatively, retrotransposons can have long terminal 
repeats, such as the Drosophila copia element. The Vader inverted repeats shown in Fig. 7, 
SEQ ID NOS:4 and 5, respectively, have a single mismatch. Elements which transpose 
through DNA copies typically have open reading frame(s) which encode a transposase 
activity. The Fot1 element is 1.9 kb in length and the Pot2 element 1.8 kb in length. Both 
the Fot1 and Pot2 elements have an ORF encoding a putative transposase-like protein. 
The Vader element, although mobile, does not have an ORF and hence it was deduced 
that the mobility of Vader was dependent upon a transposase activity present elsewhere in 
the genome. A synthetic 44 bp oligomer of the inverted repeat of Vader (SEQ ID NO: 5) 
was used to clone, via PCR, a 2.3 kb element. This element, called Tan1 (SEQ ID NO:6), 
comprises four inverted repeats (SEQ ID NOS:1, 2, 15 and 16 from 5' to 3', respectively) 
similar to those in Vader and has a unique organization IR-ORF-IR-IR-Vader-IR. Tan1 is 
2324 bp in length and has a large open reading frame (1668 bp) which encodes a putative 
transposase comprising 555 amino acids (shown in SEQ ID NOS:7 and 14), which is 
homologous to Fot1 and Pot2 transposases. Immediately 3' to the second IR (SEQ ID 
NO:2), which bounds the transposase, is a copy of the Vader element. We hypothesize 
that at some stage the independent Vader element, although inactive by itself, has arisen 
from Tan1, resulting in current strains with only one copy of Tan1 providing transposase 
activity and numerous mobile copies of Vader dispersed in the genome. 
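
The relationship between the stated ORF length and the stated protein length can be checked with simple arithmetic. The sketch below assumes, as is conventional but not stated explicitly above, that the 1668 bp figure includes the terminating stop codon.

```python
ORF_BP = 1668          # ORF length stated for Tan1
CODON_BP = 3

codons, remainder = divmod(ORF_BP, CODON_BP)
# Assumption: the last codon of the ORF is the stop codon, which encodes no residue.
amino_acids = codons - 1
```

Under that assumption the ORF comprises 556 codons with no remainder, of which 555 encode amino acids, in agreement with the 555 aa transposase.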

Thus, applicants have been the first to identify transposable element(s) within 
certain Aspergillus species. These transposable elements are believed to be quite useful in 
the development of gene tagging systems for Aspergillus or other microorganisms. Basic 
requirements for developing a gene tagging system are that the tagging element can be 
distinguished from the endogenous elements, that it displays little sequence specificity for 
transposition and that excision is followed by integration at a new site. More refined 
tagging systems include the ability to monitor excision and reinsertion by, e.g., activation of 
antibiotic resistance genes and the ability to stabilize the mutations by, e.g., a two-transposon 
system (23, 24 and 25). 

For development of a tagging system for Aspergillus, it is proposed that the system 
be tested first in A. nidulans, which we have already shown does not have endogenous 
Tan1 or Vader sequences. However, at this stage the Vader element is altered from the 
original in such a way that the same construction can be later used in A. niger var. awamori 
and be distinguished from the endogenous Vader elements. 

In a model tagging system using Vader as the "mutator," a first vector can be 
constructed for expression of the Vader element, similar to the non-autonomous maize Ds. 
The internal sequence of the Vader element is altered to contain translation initiation and 
stop codons in three different frames. This sequence can later be used as a recognition 
site for a probe in PCR analysis of the mutants. This altered Vader element, Vader-S, is 
inserted within an expression cassette conferring antibiotic resistance such as hygromycin 
resistance. Since excision of Vader may not always be precise, Vader-S is inserted in the 
promoter area (e.g., oliC) between the transcription and translation initiation sites. This 
disrupted hygromycin phosphotransferase cassette is flanked by marker genes - or 
alternatively the marker gene upstream of the hygromycin promoter can be placed within 
Vader. These marker genes can be used for monitoring whether the hygromycin gene, and 
Vader within it, have integrated in full length. A vector, for example, Vector I, containing 
these elements will be transferred to A. nidulans and transformants expressing the two 
marker genes, but sensitive to hygromycin, are selected. Screening of mutants at later 
stages is easier, if the transformant selected for mutagenesis has only one to two copies of 
Vector I sequences integrated in its genome. 

A transformant with only a few (preferentially one) intact Vader-S/hygromycin 
phosphotransferase cassettes integrated in its genome is retransformed with Vector II, 
which is an autonomously replicating vector carrying the transposase encoding gene. The 
autonomously replicating vector, pHELP, used as a basis for DNA construction work, can 
be segregated away by methods known to those skilled in the art. This enables 
stabilization of the Vader-S element after the mutagenesis step. Vader-S is activated by a 
transposase (from Tan1) in pHELP, which can be monitored by activation of the 
hygromycin resistance gene. Tan1 is not cloned into the vector in full length to disrupt its 
mobility. Again, Vector II contains a marker gene used for screening of transformants and 
also for monitoring its segregation after the sporulation phase. 

Marker genes can either complement host mutations or be dominant markers such 
as benomyl resistance, acetamidase or β-glucuronidase (GUS). 

In a model system for gene tagging the target gene for mutagenesis should be one 
with a simple plate screen, e.g., disruption of the niaD gene (by insertion of Vader), which 
can be screened by selection of chlorate resistant mutants and the gene disruption can be 
further mapped by a plate test using different nitrogen sources (no growth on nitrate, 
growth on nitrite, xanthine and uric acid). Another target gene for mutagenesis could be an 
acid protease gene. It has been shown previously for A. niger that disruption of this one 
protease is sufficient to abolish halo formation almost completely on skim milk plates. 

The advantage of using transposon tagging is that the mutants produced can be 
identified by subsequent isolation of the mutated gene. There are several methods 
available for PCR amplification of genomic sequences when only one end of the sequence 
is known - which, in this case, is the transposable element. PCR methods developed for 
genomic walking are, e.g., "Inverse PCR" (27 and 28), "Vectorette PCR" (29) and 
"Panhandle PCR" (30). 

Setting up the transposon tagging system can be followed by studies of excision 
frequency, environmental influences on transposition frequency (24, 31), activation of the 
transposase by a heterologous promoter and effect of altered inverted repeats on transposition. 

Transposon tagging need not be applied only to the inactivation of genes. 
Alternatively, tagging can be used to insert promoter sequences in Vader and thereby 
activate genes. A third option is to insert a promoterless marker gene in Vader, in which 
case the transposon can be used in search for novel, strong fungal promoters. 

Experimental 

Materials and Methods 

Strains. Vader and Tan1 elements were isolated from Aspergillus niger var. 
awamori UVK143f, derived from Northern Regional Research Laboratories (NRRL) #3112. 
E. coli JM101 [F' traD36 lacIq Δ(lacZ)M15 proA+B+/supE thi Δ(lac-proAB)] and Epicurian 
coli SURE 2 (Stratagene Cloning Systems, La Jolla, CA) were used for propagation of 
Vader and Tan1 subclones, respectively. 

Spontaneous chlorate resistant mutants were derived from Aspergillus niger var. 
awamori UVK143f (NRRL #3112). The following Aspergillus strains were obtained from the 
ATCC: A. cinnamomeus (ATCC #1027), A. wentii (ATCC #10593), and A. phoenicis 
(ATCC #11362). A. nidulans (FGSC #A237), a nitrate reductase structural gene mutant 
(niaD15), and A. nidulans (FGSC #A691), a tryptophan requiring mutant (trpC801), were 
obtained from Fungal Genetics Stock Center (FGSC), Dept. of Microbiology, University of 
Kansas Medical Center. A. versicolor, A. foetidus, and a proprietary A. niger glucoamylase 
strain are from the Genencor International Inc. culture collection. 

Mutant Selection. Spore suspensions (1 x 10^8) of UVK143f were plated on CM 
agar (11) containing 600 mM KClO3 and 10 mM glutamic acid. Chlorate (KClO3), a toxic 
analog of nitrate, allows selection of mutants in the nitrate assimilation pathway by chlorate 
resistance. Plates were incubated at 37°C until individual colonies of spontaneous mutants 
could be identified. Single mutants resistant to KClO3 were allowed to sporulate on CM 
plates and spores from these plates were then streaked onto minimal media (11) with 
various sole nitrogen sources (10 mM): NaNO3 (nitrate), NaNO2 (nitrite), hypoxanthine, uric 
acid or NH4Cl (ammonium chloride). Each of these compounds is an intermediate product 
of the nitrate assimilation pathway. niaD mutants were identified as those resistant to 
KClO3 and able to grow in the presence of all pathway intermediates, except for NaNO3. 
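
The plate-test criterion above can be restated as a small decision rule. The sketch below is illustrative only; the function name and the dictionary representation of growth results are assumptions introduced for the example.

```python
NITROGEN_SOURCES = ("nitrate", "nitrite", "hypoxanthine", "uric acid", "ammonium")


def is_niaD_candidate(chlorate_resistant: bool, growth: dict[str, bool]) -> bool:
    """Apply the criterion in the text: chlorate resistant, growth on every
    tested nitrogen source except nitrate."""
    if not chlorate_resistant:
        return False
    grows_on_others = all(growth[s] for s in NITROGEN_SOURCES if s != "nitrate")
    return grows_on_others and not growth["nitrate"]


# Hypothetical plate results for two isolates.
isolate_a = {"nitrate": False, "nitrite": True, "hypoxanthine": True,
             "uric acid": True, "ammonium": True}
isolate_b = {"nitrate": False, "nitrite": False, "hypoxanthine": True,
             "uric acid": True, "ammonium": True}

a_is_candidate = is_niaD_candidate(True, isolate_a)
b_is_candidate = is_niaD_candidate(True, isolate_b)
```

Isolate A satisfies the criterion; isolate B does not, because failure to grow on nitrite points to a lesion other than the nitrate reductase structural gene.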

Isolation of Vader via PCR Amplification. Genomic DNA of A. niger var. awamori 
niaD mutants and UVK143f was used as template (see Southern Analysis). Primers (50 
pmol) used for amplification of the niaD gene were NiaD1 (position 142-165 relative to the 
initiation site of niaD): 5'-CCAACCGAGTCCTCAGTATAGAC-3' (SEQ ID NO:8) and NiaD2 
(position 2738-2715): 5'-CAACGCTTCATAGGCGTCCAGATC-3' (SEQ ID NO:9). Deep 
Vent (exo-) DNA polymerase (New England Biolabs) was used with the buffer and dNTPs 
provided by the manufacturer. For optimal amplification of the niaD gene the reaction 
mixture contained 4 mM MgSO4. Denaturation of template DNA, 2 min. at 94°C, was 
followed by 30 cycles of denaturation (30 sec. at 94°C), annealing of primers (45 sec. at 
55°C) and extension (4 min. at 72°C). PCR fragments were purified from gel using the 
Qiaex DNA gel extraction kit (Qiagen), digested and used for restriction enzyme analysis by 
standard procedures (12). 

Confirmation of Excision Footprint by PCR Amplification and Sequencing. 
Template DNA from niaD436 was used in a PCR reaction in an attempt to amplify both the 
larger niaD sequence with an insert and the shorter niaD fragment resulting from excision 
of the Vader element. The PCR reaction was conducted as previously described, except 
for using primers MA003 (positions 359-378): 
5'-ATATGAATTCCTTCTTGACTTCCCCGGAAC-3' (SEQ ID NO:11) and NiaD5 (position 
1125-1144): 5'-ATATAAGCTTGTCACTGGACGACATTTCAG-3' (SEQ ID NO:12). The gel 
purified fragment (ca. 800 bp) resulting from the excision event was submitted for 
sequencing. 

Isolation of Tan1 via PCR Amplification. Fungal genomic DNA for PCR and 
Southern analyses was isolated from mycelia grown in CSL supplemented with 5% fructose 
(21). Genomic DNA of the A. niger var. awamori niaD436 mutant (22) was used as a 
template. A single primer (100 pmol), IR1, was used for amplification of Tan1. The 54-mer 
IR1 was derived from the 44 bp inverted repeat sequence of Vader preceded by a 
restriction enzyme recognition site for EcoRI: 5'-ATATGAATTC ACGTAATCAA 
CGGTCGGACG GGCCACACGG TCAGGCGGGC CATC-3' (SEQ ID NO:10). Deep Vent 
(exo-) DNA polymerase (New England Biolabs) was used with the buffer and dNTPs 
provided by the manufacturer. Denaturation of template DNA, 10 min. at 94°C, was 
followed by 30 cycles of denaturation (1 min. at 94°C), annealing of primers (1 min. at 
55°C) and extension (6 min. at 72°C). PCR fragments were purified from agarose gels 
using the Qiaex DNA gel extraction kit (Qiagen) and subcloned as blunt-ended inserts into 
EcoRV cut pSL1180 (Pharmacia Biotech). 
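
A single primer can amplify an element bounded by inverted repeats because the reverse complement of one terminal repeat reads (almost) the same as the other. The sketch below checks this for the IR1 primer, using the sequences given in the Sequence Listing. It assumes that SEQ ID NO:4 and SEQ ID NO:5 are the two terminal repeats of Vader (they match the two ends of SEQ ID NO:3); the variable and function names are illustrative only.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def join_blocks(text):
    """Remove the blank-separated 10-base blocks used in the Sequence Listing."""
    return "".join(text.split())


# SEQ ID NO:10 (primer IR1), SEQ ID NO:4 and SEQ ID NO:5, copied from the listing.
IR1 = join_blocks("ATATGAATTC ACGTAATCAA CGGTCGGACG GGCCACACGG TCAGGCGGGC CATC")
SEQ4 = join_blocks("ACGTAATCAA CGGTCGAACG GGCCACACGG TCAGGCGGGC CATC")
SEQ5 = join_blocks("GATGGCCCGC CTGACCGTGT GGCCCGTCCG ACCGTTGATT ACGT")

tail, repeat = IR1[:10], IR1[10:]
mismatches = sum(1 for a, b in zip(repeat, SEQ4) if a != b)

print(len(IR1), "GAATTC" in tail)           # primer length, EcoRI site in the tail
print(repeat == reverse_complement(SEQ5))   # repeat part vs. the other terminus
print(mismatches)                           # differences from SEQ ID NO:4
```

Under these assumptions the 44-nt repeat portion of IR1 is the exact reverse complement of SEQ ID NO:5 and differs from SEQ ID NO:4 at a single position, so the one primer can anneal at both ends of the element in opposite orientations.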

Estimation of niaD Mutant Reversion Frequency. Spores from niaD mutants 
niaD392, niaD410, niaD436 and niaD587 were streaked onto minimal media containing 
NaNO3 as a sole nitrogen source. Nitrate non-utilizing colonies of niaD mutants, which had 
a spidery appearance and did not sporulate, were streaked onto CM containing 600 mM 
potassium chlorate (KClO3) and incubated to confluency at 37°C. Ten-fold dilution series of 
spore suspensions (in 0.8% NaCl-0.25% Tween 80) of niaD392, niaD410, niaD436, 
niaD587 and UVK143f wild-type spores were plated on minimal media with nitrate (10 mM) 
to determine reversion frequency, and on CM to determine viability. 

Southern Analysis. Genomic DNA for PCR and Southern analysis was isolated 
(13) from mycelia grown in CSL (13), which contained 600 mM KClO3 in order to reduce 
reversion of niaD back to the wild-type during cultivation. DNA (10 µg) was digested with 
either BglII, which leaves the insertion intact in the niaD gene, or with EcoRV, which cuts 
the insertion element (Vader) once, and thus enables determination of its copy number in 
the genome. Genomic DNA (approximately 10 µg) of A. nidulans, A. cinnamomeus, A. 
versicolor, A. wentii, A. phoenicis, A. foetidus and of an industrial A. niger strain was 
digested with EcoRV to obtain an estimate of Vader copy number in these fungal genomes. 
The digested and gel-separated DNA was transferred to a positively-charged nylon 
membrane (Boehringer Mannheim) by capillary action. 

The DNA probe for the niaD gene was derived from the PCR product (UVK143f 
DNA template amplified with primers NiaD1 (SEQ ID NO:8) and NiaD2 (SEQ ID NO:9)), 
which was digested with SalI, resulting in a 528 bp probe fragment. The probe for the 
insertion element, Vader, was derived from a PCR reaction in which niaD436 DNA was 
used as a template. This PCR product was purified and digested with SalI and SphI and 
subcloned into the vector pUC19. This subclone was digested with ScaI and XbaI to yield a 
236 bp fragment which was used for estimation of the copy number of Vader sequences in 
the genomes of various fungi. 

A DNA labeling and detection kit (Genius 1, Boehringer Mannheim) was used for 
random primed labeling of probe DNA with digoxigenin, and for detection with 
alkaline-phosphatase labeled antibody to digoxigenin. 

Hybridization and washing for homologous probes were conducted as 
recommended by the manufacturer using hybridization buffer without formamide at 68°C 
(Boehringer Mannheim). Hybridizations for heterologous Southern analysis (i.e., analysis 
of DNA from other Aspergillus sp.) were conducted using hybridization buffer with 25% 
formamide at 37°C. Washes were performed as in the stringent wash protocol. 

Nitrate Reductase Assays. Nitrate reductase assays were performed as described 
in Dunn-Coleman, et al. (18). 

DNA Analysis and Sequence Determination. Sequences were determined using 
fluorescent-labeled dideoxynucleotide terminators and Taq cycle sequencing on the 373A 
sequencer (ABI). Commercially available universal and reverse (New England Biolabs) 
primers were used. Alignment of sequences and prediction of amino acid sequences were 
performed using DNASTAR (DNASTAR, Inc.). The nucleotide and deduced amino acid 
sequences were analyzed and compared to those in GenBank, EMBL and Swiss-Prot using 
FASTA and BLAST programs (Genetics Computer Group, Inc. software package, Madison, 
WI). 

Other Probes Used for Southern Analysis. The Tan1 probe was prepared by 
digesting Tan1 with HindIII and StuI, resulting in a 650 bp fragment corresponding to the 3' 
end of the transposase coding region (ORF-probe in Fig. 9). The Vader element was 
digested with XbaI and ScaI to yield a 236 bp fragment to be used for recognition of 
internal Vader sequence in Southern analysis (Vader-probe in Fig. 9). 

Southern Analysis to Determine Tan1 Copy Number. Aspergillus genomic DNA 
(10 µg) was digested with EcoRI, which cuts the Tan1 element once in the transposase 
coding region and upstream of sequences corresponding to the Vader and Tan1 probes 
used in hybridizations (Figs. 5, 9 and 10). A DNA labeling and detection kit (Genius 1, 
Boehringer Mannheim) was used for random primed labeling of probe DNA with 
digoxigenin and for detection with alkaline-phosphatase labeled antibody to digoxigenin. 
Hybridization and washing were conducted as recommended by the 
manufacturer (Boehringer Mannheim). 

Isolation of Tan1 from a Partial Genomic Library. It was known from the 
sequence of the PCR-amplified Tan1 element that Tan1 did not have restriction enzyme 
recognition sites for BglII and XhoI. A BglII-XhoI digested Southern blot of Aspergillus niger 
var. awamori genomic DNA, hybridized with the 650 bp HindIII-StuI Tan1 probe, resulted in 
identification of a 4.5 kb genomic fragment containing Tan1. A. niger var. awamori 
niaD436 DNA was digested with BglII and XhoI and fragments in a size range of 4-5 kb 
were cloned into pSP73 (Promega). This partial genomic library was screened by colony 
hybridization using the nonradioactive nucleic acid labeling and detection system from 
Boehringer Mannheim. 

Example 1 

Isolation of Spontaneous High Frequency Reverting 
niaD Mutants of A. niger var. awamori 

Assuming that niaD mutants which arise from the insertion of a transposable 
element would be unstable, a total of 152 niaD mutants, isolated on the basis of 
spontaneous resistance to chlorate, were characterized. To determine if the niaD mutation 
was unstable, spores from 43 niaD mutants were plated onto medium with nitrate as the 
sole nitrogen source. Fourteen of the mutants reverted to the wild-type phenotype at a 
frequency of greater than 1 x 10^-5. Table 1 summarizes the niaD mutant reversion studies. 



Table 1 

Mutant No.    Conidia Plated    No. Wild-Type    Reversion Frequency 
              (x 10^3)          Colonies         (x 10^-4) 

niaD392        2.9               27               93 
niaD410        7.7                5                6.5 
niaD436        3.7              164              443 
niaD587       18.9               12                6.3 



There appeared to be two classes of niaD mutants which reverted at high 
frequency. The niaD mutants niaD436 and niaD392 reverted at high frequency, while 
mutants niaD410 and niaD587 yielded smaller numbers of revertant colonies. 
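
The reversion frequencies in Table 1 follow from dividing the number of wild-type colonies by the number of conidia plated. The short sketch below reproduces that arithmetic from the tabulated values; it assumes no further correction for spore viability was applied, and the names used are illustrative.

```python
# (conidia plated, wild-type colonies, tabulated frequency x 10^-4) from Table 1.
TABLE_1 = {
    "niaD392": (2.9e3, 27, 93),
    "niaD410": (7.7e3, 5, 6.5),
    "niaD436": (3.7e3, 164, 443),
    "niaD587": (18.9e3, 12, 6.3),
}


def reversion_frequency(conidia_plated, revertant_colonies):
    """Fraction of plated conidia that gave rise to wild-type colonies."""
    return revertant_colonies / conidia_plated


for mutant, (plated, colonies, tabulated) in TABLE_1.items():
    computed = reversion_frequency(plated, colonies) * 1e4
    print(f"{mutant}: computed {computed:.1f} x 10^-4, tabulated {tabulated} x 10^-4")
```

The computed values agree with the table to within rounding, and all four mutants lie well above the 1 x 10^-5 threshold used to call a mutant unstable.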

The level of nitrate reductase activity in revertant colonies isolated from the 
niaD436 mutant was determined using the assay described in (18). Nitrate reductase 
activity was detected in 14 of 15 revertants analyzed (see Table 2). A spectrum of activities 
was detected, suggesting that excision of Vader may not always be precise. 

Table 2 

Strain                        % Nitrate Reductase Activity 
                              Compared to Wild-Type 

UVK143f (wild-type)           100 
niaD436 (niaD mutant)         ND (1) 
Revertants of niaD436: 
   1                           34.7 
   2                           42.8 
   3                           27.7 
   4                            3.5 
   5                           ND (1) 
   6                           47.4 
   7                           90.4 
   8                            9.8 
   9                           25.4 
  10                           28.9 
  11                           38.2 
  12                            6.9 
  13                           71.7 
  14                           71.7 
  15                           49.7 

(1) Activity non-detectable 
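
The spread of activities in Table 2 can be summarized with a few lines of code. This is a minimal sketch that only restates the tabulated values (with the non-detectable entry stored as None); the variable names are illustrative.

```python
# % nitrate reductase activity of niaD436 revertants 1-15 (Table 2); None = not detectable.
REVERTANT_ACTIVITY = {
    1: 34.7, 2: 42.8, 3: 27.7, 4: 3.5, 5: None,
    6: 47.4, 7: 90.4, 8: 9.8, 9: 25.4, 10: 28.9,
    11: 38.2, 12: 6.9, 13: 71.7, 14: 71.7, 15: 49.7,
}

detectable = {n: a for n, a in REVERTANT_ACTIVITY.items() if a is not None}
lowest = min(detectable.values())
highest = max(detectable.values())

print(f"{len(detectable)} of {len(REVERTANT_ACTIVITY)} revertants with detectable activity")
print(f"range: {lowest}% to {highest}% of wild-type")
```

None of the revertants reaches the wild-type level, which is the observation behind the suggestion that excision is not always precise.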

Example 2 

Cloning of a Vader Element 



To determine if an insertion sequence was located within the niaD gene, two 
primers were synthesized. The first primer, niaD1 (SEQ ID NO:8), corresponded to position 
142-165 of the niaD gene, and the second, niaD2 (SEQ ID NO:9), corresponded to position 
2738-2715 of the niaD gene. Genomic DNA was isolated from 14 unstable niaD mutants. 
This genomic DNA served as template for PCR with these primers. PCR products from four 
niaD mutants (410, 436, 587 and 392) revealed an approximately 440 bp insertion (Vader) 
in the niaD gene. 

For Southern blot analysis, genomic DNA isolated from the wild-type and four niaD 
mutants (410, 436, 587 and 392) was digested with BglII. The probe used was a SalI 
digestion fragment of the 500 bp PCR product generated using the niaD1 (SEQ ID NO:8) 
and niaD2 (SEQ ID NO:9) oligomeric probes. The probe hybridized to a 2.5 kb fragment 
with wild-type DNA (lane 5, Fig. 1). In the case of the niaD mutants 410 (lane 1, Fig. 1), 
436 (lane 3, Fig. 1) and 392 (lane 4, Fig. 1), the probe hybridized to a 2.9 kb fragment. 
These results indicate that these three niaD mutants contain an approximately 440 bp 
insertion. Interestingly, with the mutant niaD587, the probe hybridized to both a 2.5 kb and 
a 2.9 kb fragment. Although the mycelium had been grown in the presence of KClO3 to 
favor growth of the niaD mutant and not revertant cells, the detection of two hybridizable 
sequences indicated that in some cells Vader had been excised from the niaD gene. 
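
The band sizes reported here are what one would expect if BglII does not cut within the insertion, so that the insertion simply lengthens the hybridizing fragment. The following sketch works through that arithmetic; the 437 bp length is the Vader length given elsewhere in this document, and the function names are hypothetical.

```python
VADER_BP = 437  # length of the Vader element stated in this document


def bglii_band_kb(wild_type_band_kb, has_insertion):
    """Expected size (kb, one decimal) of the niaD-hybridizing BglII fragment."""
    extra_kb = VADER_BP / 1000 if has_insertion else 0.0
    return round(wild_type_band_kb + extra_kb, 1)


def expected_bands(wild_type_band_kb, cell_types):
    """cell_types: one boolean per cell type in the culture (True = insertion present)."""
    return sorted({bglii_band_kb(wild_type_band_kb, state) for state in cell_types})


print(expected_bands(2.5, [False]))        # wild-type
print(expected_bands(2.5, [True]))         # stable insertion mutant
print(expected_bands(2.5, [True, False]))  # mixed culture with some excision
```

A culture in which some cells have lost the element gives both bands, which matches the pattern described for niaD587.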

The approximate location of the insertion was determined in each of the four 
unstable niaD mutants by restriction mapping analysis. The location of the insertion in 
each of the four mutants examined is shown in Fig. 2. All four mutants had an 
approximately 440 bp insertion located at different sites within the niaD gene. 

Example 3 

Determination of Vader Copy Number 
To determine the Vader copy number, a 236 bp ScaI-XbaI internal fragment of 
Vader-2 (cloned from the mutant niaD436) was hybridized to EcoRV-cleaved genomic 
DNA. There is only one EcoRV site within the Vader transposon. Southern blot analysis 
indicated that there are approximately fifteen copies of Vader sequences in the genome of 
A. niger var. awamori (Fig. 4). The Vader sequences were integrated at identical genomic 
locations in the three niaD mutants, 410, 436 and 587. However, in the niaD392 mutant, 
Vader sequences were located in five different locations compared to the three niaD 
mutants examined. This result was somewhat surprising considering that all four niaD 
mutants were isolated from the same strain, but provides good evidence for the high 
mobility of the Vader element in this strain. When a proprietary A. niger glucoamylase 
production strain (ETC #2663) was also examined, approximately 15 hybridization signals 
could be detected. Although some of the hybridization patterns appeared to be identical, 
clear differences could be seen between A. niger var. awamori and A. niger. 
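
The copy-number estimate relies on two properties of the element: a single EcoRV site, and a probe that lies entirely on one side of it, so that each genomic copy contributes one hybridizing band. The sketch below checks both against the 437 bp sequence of SEQ ID NO:3, transcribed here from the Sequence Listing (an assumption is that this listing sequence is representative of the cloned element). The cut offsets are those of the standard recognition sites (ScaI AGT^ACT, XbaI T^CTAGA, EcoRV GAT^ATC); the function names are illustrative.

```python
# SEQ ID NO:3 (Vader, 437 bp), transcribed from the Sequence Listing.
VADER = "".join("""
ACGTAATCAA CGGTCGAACG GGCCACACGG TCAGGCGGGC CATCCTGAAA TCCCATATAA
AAGATGTCTT GGGGATTCTA TTATATATCA ACCAGTACTA CTTCTATGAA GCTCTAACTT
TGTAGATAGT TATATATATA AGAATAAGTA TTCCATGAAT TTTTCAGATT TTAGAATTTT
TACTTTGATA ATGAAACCAG ATTCTTATAT AAAACATATA AATACAGATA TTGTAATATG
ATAAGTCCAT AAGTAAAAGT ATATTCATTT TTAGAAGGTA TATAGATATT ATTTATATTA
TTTAAAATCT ATATAGAAGA AATCTAATTC TTCTAGACCT GGATGGTAGA GATATATTAT
GTTTAAAAAG ATATCTTTTG TATAGTATTA CCAGATGGCC CGCCTGACCG TGTGGCCCGT
CCGACCGTTG ATTACGT
""".split())

# Recognition site and top-strand cut offset within the site.
SITES = {"ScaI": ("AGTACT", 3), "XbaI": ("TCTAGA", 1), "EcoRV": ("GATATC", 3)}


def cut_positions(seq, enzyme):
    """Return 0-based top-strand cut positions for every occurrence of the site."""
    site, offset = SITES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


sca = cut_positions(VADER, "ScaI")
xba = cut_positions(VADER, "XbaI")
eco = cut_positions(VADER, "EcoRV")
probe = VADER[sca[0]:xba[0]]  # top strand of the ScaI-XbaI fragment

print(len(VADER), len(probe), len(eco))
```

With this transcription the ScaI-XbaI fragment is 236 nucleotides long on the top strand, in agreement with the probe size stated in the text, and the single EcoRV site lies outside the probe.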

Example 4 
Isolation of Vader in Other Fungal Species 
In an attempt to determine if this transposable element was found in other 
filamentous fungi, genomic Southern blot analysis was performed using the 236 bp 
fragment (XbaI-ScaI) of Vader sequence, as per Example 3, as a probe (Fig. 5). Two 
strains of A. nidulans were obtained from the Fungal Genetics Stock Center (FGSC): FGSC 
#A691, a nitrate reductase structural gene mutant (niaD15), and FGSC #A237, a 
tryptophan-requiring mutant (trpC801). No hybridization signals could be visualized with 
strain A691, and a single strong hybridization signal could be detected with strain A237. 
These results support the notion that the lack of success in cloning transposable elements 
from laboratory strains of A. nidulans is due to low copy number or absence. Similarly, only 
one hybridization signal could be detected in A. foetidus and A. phoenicis, while two 
hybridization signals were detected in A. cinnamomeus. No hybridization signals could be 
detected in A. wentii and A. versicolor. In addition, no hybridization signals could be 
detected with Humicola grisea var. thermoidea, Neurospora crassa and Trichoderma 
reesei (results not shown). These results indicate that the Vader element is most 
commonly found in A. niger var. awamori and A. niger. 

Example 5 

Excision of the Vader Element 
Part of the niaD gene from niaD436 containing the Vader element was amplified 
using PCR. The PCR amplification resulted in the expected 1200 bp fragment of the Vader 
element flanked by niaD sequences and a shorter 800 bp fragment resulting from the 
excision event. Sequencing of the shorter fragment indicated that the Vader element had 
excised precisely. However, when several revertants of niaD436 and niaD410 were 
assayed for their nitrate reductase activity (18), a spectrum of activities was detected, 
suggesting that excision of the Vader element may not always be precise (results not 
shown). 
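
The two fragment sizes can be estimated from the primer coordinates given in the methods (MA003 at positions 359-378 and NiaD5 at positions 1125-1144, each carrying a 10-nucleotide 5' extension). The sketch below is a rough calculation under the assumptions that the Vader insertion in niaD436 lies between the two primers and that the target-site duplication can be ignored; the function name is illustrative.

```python
VADER_BP = 437  # Vader length stated in this document
TAIL_BP = 10    # 5' extensions ATATGAATTC (MA003) and ATATAAGCTT (NiaD5)


def amplicon_bp(forward_start, reverse_end, insert_bp=0):
    """Expected PCR product length for primers spanning forward_start..reverse_end."""
    genomic_bp = reverse_end - forward_start + 1
    return genomic_bp + 2 * TAIL_BP + insert_bp


print(amplicon_bp(359, 1144))            # 806, after excision
print(amplicon_bp(359, 1144, VADER_BP))  # 1243, with Vader still inserted
```

These values are consistent with the approximately 800 bp and 1200 bp fragments described in the text.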

Example 6 

Isolation of Tan1 

The previously isolated Vader element, although mobile, did not have an ORF 
encoding transposase activity presumed to be required for excision (22). This observation 
led to a search for a larger, transposase-encoding element. An oligomer corresponding 
to the Vader inverted repeat was synthesized and used for PCR amplification of 
genomic A. niger var. awamori DNA. The PCR amplification resulted in the generation of 
three DNA fragments: the 0.4 kb Vader element, as expected, and fragments of 1.9 kb and 
2.3 kb in length. 

Both of the larger PCR-generated fragments were sequenced and the sequences 
were identical, with the exception that the 2.3 kb fragment had an additional 400 bp at the 3' 
end. Surprisingly, this additional sequence at the 3' end was a Vader element, which 
differed only by a few nucleotides from the previously isolated Vader. The 5' end 
sequence, shared by both the 1.9 kb and 2.3 kb fragments, had a single ORF (1668 bp) 
coding for a protein of 555 amino acids flanked by inverted repeats (IRs). Thus, the 1.9 kb 
fragment, devoid of the Vader element, had an organization of IR-ORF-IR. The larger 2.3 
kb fragment had a unique organization, IR-ORF-IR-IR-Vader-IR, with a total of four inverted 
repeats (Figs. 9 and 11). In this larger element the two central inverted repeats, side by 
side, potentially form a tight hairpin structure, and despite many sequencing attempts with 
varying conditions, we were unable to determine the sequence between the two inverted 
repeats. However, the overall length of the PCR product, as determined by 
electrophoresis, corresponded to the size of the sequence shown in Fig. 11, suggesting 
that the two central contiguous IRs are not separated by a large segment of DNA. 
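
The sizes quoted in this example are mutually consistent, as the short calculation below shows. It uses only the rounded figures given in the text and assumes the 1668 bp ORF length includes the stop codon; the variable names are illustrative.

```python
ORF_BP = 1668           # ORF length stated in the text
IR_ORF_IR_BP = 1900     # the "1.9 kb" fragment, rounded
EXTRA_VADER_BP = 400    # the additional "400 bp" at the 3' end, rounded

codons, remainder = divmod(ORF_BP, 3)
residues = codons - 1   # the final codon is assumed to be the stop codon

print(remainder, codons, residues)          # 0 556 555
print(IR_ORF_IR_BP + EXTRA_VADER_BP)        # 2300, i.e. the "2.3 kb" fragment
```

The ORF is an exact multiple of three and encodes 555 residues plus a stop codon, and the 1.9 kb fragment plus the 0.4 kb Vader element accounts for the 2.3 kb fragment.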

Due to the organization of the 1.9 kb and 2.3 kb fragments, it was believed that the 
1.9 kb fragment could have arisen in PCR from a partial amplification of the 2.3 kb 
fragment if the 3' IR-primer had annealed to the first central IR instead of the IR at the end 
of the Vader element. Southern analysis was conducted in order to determine if the 1.9 kb 
element existed in the genome without the associated Vader element, or whether it was a 
PCR-artifact derived from a partial amplification of the 2.3 kb element. The two probes 
used in Southern analysis corresponded to the internal sequence of Vader and to the 
carboxyterminal part of the ORF (Fig. 9). The genomic DNA from A. niger var. awamori 
niaD mutants and UVK143f was digested with EcoRI, which cuts once in the coding region 
of the ORF upstream from the ORF-probe and does not cut Vader. The Southern analysis 
showed numerous bands for the Vader element (Fig. 10), similar to previous Southern 
analyses (22). However, only one fragment hybridized with the probe corresponding to the 
ORF, and a fragment of the same size (1.6 kb) was recognized by the Vader probe (Fig. 10). 
It was concluded that the actual element in the genome was the 2.3 kb fragment and that the 
shorter 1.9 kb fragment had only been a PCR-artifact. The isolated 2.3 kb fragment was 
designated as Tan1. 

A genomic clone of the Tan1 element (2.3 kb) was isolated from a partial genomic 
library. Restriction enzymes, which were shown not to have any recognition sites in the 
PCR-amplified Tan1, were used separately and in combinations in Southern analysis of the 
genomic DNA. A double digestion with BglII and XhoI resulted in a relatively short, 4.5 kb, 
fragment which hybridized with the ORF-specific probe (data not shown). Genomic DNA 
fragments cleaved by BglII and XhoI and between 4 kb and 5 kb in size were cloned into 
pSP73 (Promega). The correct clone containing the Tan1 element was isolated by colony 
hybridization using the ORF-specific probe. Differences between the sequences of the 
genomic clone and the PCR-generated Tan1 were minor, even for the flanking IRs, which 
were almost identical even though in the PCR-generated Tan1 the IRs were derived from 
the Vader IRs (PCR primers). It was seen from the genomic clone of Tan1 that immediately 
outside of the terminal IRs there were TA-dinucleotides, suggesting a TA target site and its 
duplication upon insertion. The sequence of the Tan1 genomic clone is shown in Figs. 11A 
and 11B [SEQ ID NO:13 (DNA) and SEQ ID NO:14 (amino acid)]. 

Example 7 

Insertional Inactivation/Gene Tagging 

Vader was cloned by insertional inactivation of the target gene niaD, which encodes 
nitrate reductase. The target sequence for integration of Vader is TA, a sequence which 
must be very common in the genome of fungi. Nitrate reductase mutants cannot grow on 
nitrate and in consequence are resistant to the toxic analog of nitrate, KClO3. 

It is possible that one of the reasons heterologous protein production in fungi is 
lower than that of homologously produced protein using the same promoter is that the 
heterologous protein is being degraded by the cell. If there are genes whose products are 
responsible for degrading/sequestering foreign protein, it would be advantageous to 
inactivate those genes. In order to achieve this, a strain which lacks the Tan1 gene is 
constructed using gene disruption. Such a strain is then used to transform and express a 
heterologous protein such as the mammalian chymosin protein. It would be advantageous 
if the activity of such genes could be visualized or selected for on petri dishes. For 
example, chymosin produced in A. niger results in a halo of clearing around a colony grown 
on skim milk. (See US Patent 5,364,770, the disclosure of which is incorporated herein by 
reference.) 

Having transformed the strain with a construct comprising the desired heterologous 
protein or polypeptide, one would transform the strain a second time with Vader and Tan1 
appropriately modified for gene tagging purposes. 

The transformants are then plated on medium which can be used to visualize 
heterologous protein production, such as skim milk plates in the case of chymosin. 

The plates are then screened for increased halo size, which is the result of 
inactivation of a gene whose product limits foreign protein production. 

The inactivated gene can be cloned using the transposon sequences as a marker 
for cloning strategies. (See generally (19).) 

Example 8 

Elevation of Gene Expression Using Transposons 
A reason that heterologous protein production is lower than expected in fungi is 
presumed to be that genes essential for foreign (heterologous) gene production are not 
expressed at sufficiently high levels in the fungi. 

In order to overcome this problem, utilizing the transposable element(s) of the 
present invention, a strain is constructed in which the native Tan1 gene is inactivated by 
gene disruption. 

This strain is used to express a heterologous protein whose expression can be 
easily visualized, such as chymosin (US Patent 5,364,770). A second transformation is 
made with Vader and Tan1, appropriately modified for gene tagging purposes. The internal 
sequence of Vader is replaced by a promoter sequence. One of the many integration 
events possible will be the integration of this promoter-carrying Vader element 5' to a 
gene beneficial to heterologous protein (e.g., chymosin) expression or secretion. Upon 
insertion, this beneficial gene is activated and such integrant colonies can be screened for, 
e.g., increased halo size (chymosin). The activated gene can be cloned using the 
transposon sequences as a marker for cloning strategies. 

Sequence Listing 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Amutan, Maria 

Dunn-Coleman, Nigel 
Nyyssonen, Eini M. 

(ii) TITLE OF INVENTION: Identification of and Cloning a Mobile 
Transposon from Aspergillus 

(iii) NUMBER OF SEQUENCES: 17 

(iv) CORRESPONDENCE ADDRESS: 

(A) NAME: Genencor International, Inc. 

(B) STREET: 925 Page Mill Road 

(C) CITY: Palo Alto 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 94304 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO) 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: August 16, 1996 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Horn, Margaret A. 

(B) REGISTRATION NUMBER: 33,401 

(C) REFERENCE/DOCKET NUMBER: GC270-2 

(xi) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (415) 846-7536 

(B) TELEFAX: (415) 845-6504 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
ACGTAATCAA CGGTCGGGCG GGCCACACGG TCAGGCGGGC CACCC 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
GATGGCCCGC CTGACCGTGT GGCCCGCCCG ACCGTTGATT ACGT 44 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 437 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



ACGTAATCAA CGGTCGAACG GGCCACACGG TCAGGCGGGC CATCCTGAAA TCCCATATAA 60 

AAGATGTCTT GGGGATTCTA TTATATATCA ACCAGTACTA CTTCTATGAA GCTCTAACTT 120 

TGTAGATAGT TATATATATA AGAATAAGTA TTCCATGAAT TTTTCAGATT TTAGAATTTT 180 

TACTTTGATA ATGAAACCAG ATTCTTATAT AAAACATATA AATACAGATA TTGTAATATG 240 

ATAAGTCCAT AAGTAAAAGT ATATTCATTT TTAGAAGGTA TATAGATATT ATTTATATTA 300 

TTTAAAATCT ATATAGAAGA AATCTAATTC TTCTAGACCT GGATGGTAGA GATATATTAT 360 

GTTTAAAAAG ATATCTTTTG TATAGTATTA CCAGATGGCC CGCCTGACCG TGTGGCCCGT 420 

CCGACCGTTG ATTACGT 437 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
ACGTAATCAA CGGTCGAACG GGCCACACGG TCAGGCGGGC CATC 44 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GATGGCCCGC CTGACCGTGT GGCCCGTCCG ACCGTTGATT ACGT 44 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2325 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

ACGTAATCAA CGGTCGGACG GGCCACACGG TCAGGCGGGC CATCCCTTCG AAAACACCAC 60 

CTTGAATCAC CTACCCGAGG CTTTTCAACC ACCACAAATG CCACCAAAAG CATCTATCCC 120 

ATCAAAATCG CAGGTGGAGC AGGAAGGCAG GATTCTTCTT GCCATTGAAG CTATTCAGAA 180 

AGGCCAAATC ACTAGTATTC GTGAAGCAGC GCGTGTTTAT GACGTCGCTC GAACTACTCT 240 

CCAGGCTCGA TTATCTGGAC GTGTTTTCGC TAAAAATATG ACCAACGCAC GTCAAAAATT 300 

GTCAAATAAT GAAGAGGAAT CGCTTGTTAA ATGGATCCTA TCTCTAGATA AGCGAGGAGC 360 

AAGCCCCCGG CCACTTGATA TCAGAGATAT GGCTAATTTG ATTATCTCTA AACGAGGTTA 420 

TTCAACTGTT GAACAAGTAG GCATCAACTG GGCTTATAGC TTTGTTAAAC GCCACGAATC 480 

CCTACGAACT CGATTTGCTA GACGACTCAA CTATCCAAGA GCTAAAATGG AGGATCCTGA 540 

AGTTATAAAA GACTGGTTCC AACGCGTACA GGAAGTTATT CAAGAGTACG GGATCTCATC 600 

AGATGATATA TACAATTTCG ATGAAACAGG GTTTGCTATG GGAATGATTG CTACATATAA 660 

AGTAGTAACT AGTTCCCAGA GGGCAGGTCG GCCGTCCCTA GTTCAACCAG GGAATCGGGA 720 

ATGGGTCACT CCAATTGAGT GTATTCGCTC TAATGGAGAG GTTCTACCTT CGACCCTGAT 780 

CTTTAAAGGC AAAACACATC TAAAGGCATG GTATGAAGGT CAATCTATTC CTCCTACCTG 840 

GAGATTTGAA GTCAGTGATA ATGGTTGGAC TACTGATAAA ATTGGACTTC GATGGCTTCC 900 

AAAACACTTC ATTCCCTTGA TTAGAGGCAA ATCAGTAGGC AAATATAGCC TCCTAGTCCT 960 

CGATGGCCAC GGTAGTCATT TGACACCTGA ATTCGACCAA TCCTGTGCTG AAAATGAGGT 1020 

TATACCTATT TGTATGCCAG CTCATTCGTC CCATCTACTT CAGCCTCTTG ATGTTGGTTG 1080 

TTTTAGTGTG CTTAAACGCA CGTACGGAGG CATGGTTCCC AAGCAGATGC AATACGGCCG 1140 

CAATCATATC GACAAGCTTG ACTTCTTAGA GGTCTATCCT AAAGCTCACC AGTGTGCTTT 1200 

ATCAAAGTCG AATATAATCA GTGGTTTTAG AGCAACAGGT CTTGTTCCTC TAGATCCTGA 1260 

TCAAGTGCTT TCTCGACTCC ATATTCGCTT GAAAACACCA CCAACCCCGG ATAGCCAGTC 1320 

AAGTGGCTCA GTGCTTCAAA CACCACATAA TATAAAACAC CTTTTGGAGC ATCCAAAATC 1380 

AGTGGAACGC CTACTTCGGA AACGGCAAGC AAGTCCAACT TCACCTACAA ACTCTACACT 1440 

ACGTCAGCTT CTCAAAGGGT GTGAACTAGC AATAACAAAC TCAATCATAC TGGCTAAGGA 1500 

GAATGCGGAA TTACGTGCTA GCCATGAAAA GCAACTACCA AAGAGGAAGC GTTCAAGGAA 1560 

GCAGGTGATC TATACAGAAG GCACTACCGT TGAAGAGGCC CAGAGAGCTA TACAGGAAGT 1620 

GGAAGAGGTG CAGAATGATG AAGATATTGA GGTTGAACCC CAATCTCAAT ATACGGAGAC 1680 

CCCCTCGCGC GCGCCTCCAC GCTGCAGTAA TTGCTTCAAT ATAGGCCACC GACGTACACA 1740 

GTGTTCTAAA CCACCTACTA ATTAGTTAGA TAGCTGTTTT TACAAGCATT TATGTTGATT 1800 

TAGAGGCCTC ATTTGGATCA TATCGGGTAA TCCTACCGGG AGATGGCCCG CCTGACCGTG 1860 

TGGCCCGCCC GACCGTTGAT TACGTNNNNN ACGTAATCAA CGGTCGGACG GGCCCCCCGG 1920 

TCCGGCGGGC CATCTGGTAA TACTATACCA AAGATATCTT TTTAAACATA ATATATCTCT 1980 

ACCATCCAGG TCTAGGAGAA TTAGATTTCT TCTATATAGA TTTTAAATAA TATAAATAAT 2040 

ATCTATATAC CTTCTAAAAA TGAATATACT TTTACTTATG GACTTATCAT ATTACAATAT 2100 

CTGTATTTAT ATGTATTATA TAAGAATCTG GTTTCATTAT CAAAGTAAAA ATTCTAAAAT 2160 

CTGAAAAATT CATGGAATAC TTATTCTTAT ATATATAACT ATCTACAAAG TTAGAGCTTC 2220 

ATAGAAGTAG TACTGGTTGA TATATAATAG AATCCCCAAG ACATCTTTTA TATGGGATTT 2280 

CAGGATGGCC GCCGACCGTG TGGCCCGTCC GACCGTTGAT TACGT 2325 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 555 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Pro Pro Lys Ala Ser Ile Pro Ser Lys Ser Gln Val Glu Gln Glu 
1               5                   10                  15 

Gly Arg Ile Leu Leu Ala Ile Glu Ala Ile Gln Lys Gly Gln Ile Thr 
            20                  25                  30 

Ser Ile Arg Glu Ala Ala Arg Val Tyr Asp Val Ala Arg Thr Thr Leu 
        35                  40                  45 

Gln Ala Arg Leu Ser Gly Arg Val Phe Ala Lys Asn Met Thr Asn Ala 
    50                  55                  60 

Arg Gln Lys Leu Ser Asn Asn Glu Glu Glu Ser Leu Val Lys Trp Ile 
65                  70                  75                  80 

Leu Ser Leu Asp Lys Arg Gly Ala Ser Pro Arg Pro Leu Asp Ile Arg 
                85                  90                  95 

Asp Met Ala Asn Leu Ile Ile Ser Lys Arg Gly Tyr Ser Thr Val Glu 
            100                 105                 110 

Gln Val Gly Ile Asn Trp Ala Tyr Ser Phe Val Lys Arg His Glu Ser 
        115                 120                 125 

Leu Arg Thr Arg Phe Ala Arg Arg Leu Asn Tyr Pro Arg Ala Lys Met 
    130                 135                 140 

Glu Asp Pro Glu Val Ile Lys Asp Trp Phe Gln Arg Val Gln Glu Val 
145                 150                 155                 160 

Ile Gln Glu Tyr Gly Ile Ser Ser Asp Asp Ile Tyr Asn Phe Asp Glu 
                165                 170                 175 

Thr Gly Phe Ala Met Gly Met Ile Ala Thr Tyr Lys Val Val Thr Ser 
            180                 185                 190 

Ser Gln Arg Ala Gly Arg Pro Ser Leu Val Gln Pro Gly Asn Arg Glu 
        195                 200                 205 

Trp Val Thr Pro Ile Glu Cys Ile Arg Ser Asn Gly Glu Val Leu Pro 
    210                 215                 220 

Ser Thr Leu Ile Phe Lys Gly Lys Thr His Leu Lys Ala Trp Tyr Glu 
225                 230                 235                 240 

Gly Gln Ser Ile Pro Pro Thr Trp Arg Phe Glu Val Ser Asp Asn Gly 
                245                 250                 255 

Trp Thr Thr Asp Lys Ile Gly Leu Arg Trp Leu Pro Lys His Phe Ile 
            260                 265                 270 

Pro Leu Ile Arg Gly Lys Ser Val Gly Lys Tyr Ser Leu Leu Val Leu 
        275                 280                 285 

Asp Gly His Gly Ser His Leu Thr Pro Glu Phe Asp Gln Ser Cys Ala 
    290                 295                 300 

Glu Asn Glu Val Ile Pro Ile Cys Met Pro Ala His Ser Ser His Leu 
305                 310                 315                 320 

Leu Gln Pro Leu Asp Val Gly Cys Phe Ser Val Leu Lys Arg Thr Tyr 
                325                 330                 335 

Gly Gly Met Val Pro Lys Gln Met Gln Tyr Gly Arg Asn His Ile Asp 
            340                 345                 350 

Lys Leu Asp Phe Leu Glu Val Tyr Pro Lys Ala His Gln Cys Ala Leu 
        355                 360                 365 

Ser Lys Ser Asn Ile Ile Ser Gly Phe Arg Ala Thr Gly Leu Val Pro 
    370                 375                 380 

Leu Asp Pro Asp Gln Val Leu Ser Arg Leu His Ile Arg Leu Lys Thr 
385                 390                 395                 400 

Pro Pro Thr Pro Asp Ser Gln Ser Ser Gly Ser Val Leu Gln Thr Pro 
                405                 410                 415 

His Asn Ile Lys His Leu Leu Glu His Pro Lys Ser Val Glu Arg Leu 
            420                 425                 430 

Leu Arg Lys Arg Gln Ala Ser Pro Thr Ser Pro Thr Asn Ser Thr Leu 
        435                 440                 445 

Arg Gln Leu Leu Lys Gly Cys Glu Leu Ala Ile Thr Asn Ser Ile Ile 
    450                 455                 460 

Leu Ala Lys Glu Asn Ala Glu Leu Arg Ala Ser His Glu Lys Gln Leu 
465                 470                 475                 480 

Pro Lys Arg Lys Arg Ser Arg Lys Gln Val Ile Tyr Thr Glu Gly Thr 
                485                 490                 495 

Thr Val Glu Glu Ala Gln Arg Ala Ile Gln Glu Val Glu Glu Val Gln 
            500                 505                 510 

Asn Asp Glu Asp Ile Glu Val Glu Pro Gln Ser Gln Tyr Thr Glu Thr 
        515                 520                 525 

Pro Ser Arg Ala Pro Pro Arg Cys Ser Asn Cys Phe Asn Ile Gly His 
    530                 535                 540 

Arg Arg Thr Gln Cys Ser Lys Pro Pro Thr Asn 
545                 550                 555 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CCAACCGAGT CCTCAGTATA GAC 23 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CAACGCTTCA TAGGCGTCCA GATC 24 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ATATGAATTC ACGTAATCAA CGGTCGGACG GGCCACACGG TCAGGCGGGC CATC 54
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATATGAATTC CTTCTTGACT TCCCCGGAAC 30

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
ATATAAGCTT GTCACTGGAC GACATTTCAG 30
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2329 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



ACGTAATCAA CGGTCGGGCG GGCCACACGG TCAGGCGGGC CACCCCTTCG AAAACACCAC 60

CTTGAATCAC CTACCCGAGG CTTTTCAACC ACCACAAATG CCACCAAAAG CATCTATCCC 120

ATCAAAATCG CAGGTGGAGC GGGAAGGCAG GATTCTTCTT GCCATTGAAG CTATTGAGAA 180

AGGCCAAATC ACTAGTATTC GTGAAGCAGC GCGTGTTTAT GACGTCGCTC GAACTACTCT 240

CCAGGCTCGA TTATCTGGAC GTGTTTTCGC TAAAAATATG ACCAACGCAC GTCAAAAATT 300

GTCAAATAAT GAAGAGGAAT CGCTTGTTAA ATGGATCCTA TCTCTAGATA AGCGAGGAGC 360

AAGCCCCCGG CCACTTGATA TCAGAGATAT GGCTAATTTG ATTATCTCTA AACGAGGTTA 420

TTCAACTGTT GAACAAGTAG GCATCAACTG GGCTTATAGC TTTGTTAAAC GCCACGAATC 480

CCTACGAACT CGATTTGCTA GACGACTCAA CTATCAAAGA GCTAAAATGG AGGATCCTGA 540

AGTTATAAAA GACTGGTTCA AACGCGTACA GGAAGTTATT CAAGAGTACG GGATCTCATC 600

AGATGATATA TACAATTTCG ATGAAACAGG GTTTGCTATG GGAATGATTG CTACATATAA 660

AGTAGTAACT AGTTCCCAGA GGGCAGGTCG GCCGTCCCTA GTTCAACCAG GGAATCGGGA 720

ATGGGTCACT GCAATTGAGT GTATTCGCTC TAATGGAGAG GTTCTACCTT CGACCCTGAT 780

CTTTAAAGGC AAAACACATC TAAAGGCATG GTATGAAGGT CAATCTATTC CTCCTACCTG 840

GAGATTTGAA GTCAGTGATA ATGGTTGGAC TACTGATAAA ATTGGACTTC GATGGCTTCA 900

AAAACACTTC ATTCCCTTGA TTAGAGGCAA ATCAGTAGGC AAATATAGCC TCCTAGTCCT 960

CGATGGCCAC GGTAGTCATT TGACACCTGA ATTCGACCAA TCCTGTGCTG AAAATGAGGT 1020

TATACCTATT TGTATGCCTG CTCATTCGTC CCATCTACTT CAGCCTCTTG ATGTTGGTTG 1080

TTTTAGTGTG CTTAAACGCA CGTACGGAGG CATGGTTCAA AAGCAGATGC AATACGGCCG 1140

CAATCATATC GACAAGCTTG ACTTCTTAGA GGTCTATCCT AAAGCTCACC AGTGTGCTTT 1200

ATCAAAGTCG AATATAATCA GTGGTTTTAG AGCAACAGGT CTTGTTCCTC TAGATCCTGA 1260

TCAAGTGCTT TCTCGACTCC ATATTCGCTT GAAAACACCA CCAACCCCGG ATAGCCAGTC 1320

AAGTGGCTCA GTGCTTCAAA CACCACATAA TATAAAACAC CTTTTGAAGC ATCCAAAATC 1380

AGTGGAACGC CTACTTCGGA AACGGCAAGC AAGTCCAACT TCACCTACAA ACTCTACACT 1440

ACGTCAGCTT CTCAAAGGGT GTGAACTAGC AATAACAAAC TCAATCATAC TGGCTAAGGA 1500

GAATGCGGAA TTACGTGCTA GCCATGAAAA GCAACTACCA AAGAGGAAGC GTTCAAGGAA 1560

GCAGGTGATC TATACAGAAG GCACTACCGT TGAAGAGGCC CAGAGAGCTA TACAGGAAGT 1620

GGAAGAGGTG CAGAATGATG AAGATATTGA GGTTGAACCC CAATCTCAAT ATACGGAGAC 1680

CCCCTCGCGC GCGCCTCCAC GCTGCAGTAA TTGCTTCAAT ATAGGCCACC GACGTACACA 1740

GTGTTCTAAA CCACCTACTA ATTAGTTAGA TAGCTGTTTT TACAAGCATT TATGTTGATT 1800

TAGAGGCCTC ATTTTGATCA TATCGGGTAA TCCTACCGAG AGATGGCCCG CCTGACCGTG 1860

TGGCCCGCCC GACCGTTGAT TACGTNNNNN ACGTAATCAA CGGTCGGACG GGCCCCCCGG 1920

TCCGGCGGGC CATCTGGTAA TACTATACAA AAGATATCTT TTTAAACATA ATATATCTCT 1980

ACCATCCAGG TCTAGGAGAA TTAGATTTCT TCTATATAGA TTTTAAATAA TATAAATAAT 2040

ATCTATATAC CTTCTAAAAA TGAATATACT TTTACTTATG GACTTATCAT ATTACAATAT 2100

CTGTATTTAT ATGTATTATA TAAGAATCTG GTTTCATTAT CAAAGTAAAA ATTCTAAAAA 2160

TCTGAAAAAT TCATGGAATA CTTATTCTTA TATATATAAA CTATCTACAA AGTTAGAGCT 2220

TCATAGAAGT AGTACTGGTT GATATATAAT AGAATCAAAA AGACATCTTT TATATGGGAT 2280

TTCAGGATGG CCCGCCTGAC CGTGTGGCCC GTTCGACCGT TGATTACGT 2329
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 555 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Pro Pro Lys Ala Ser Ile Pro Ser Lys Ser Gln Val Glu Arg Glu
1 5 10 15

Gly Arg Ile Leu Leu Ala Ile Glu Ala Ile Arg Lys Gly Gln Ile Thr
20 25 30

Ser Ile Arg Glu Ala Ala Arg Val Tyr Asp Val Ala Arg Thr Thr Leu
35 40 45

Gln Ala Arg Leu Ser Gly Arg Val Phe Ala Lys Asn Met Thr Asn Ala
50 55 60

Arg Gln Lys Leu Ser Asn Asn Glu Glu Glu Ser Leu Val Lys Trp Ile
65 70 75 80

Leu Ser Leu Asp Lys Arg Gly Ala Ser Pro Arg Pro Leu Asp Ile Arg
85 90 95

Asp Met Ala Asn Leu Ile Ile Ser Lys Arg Gly Tyr Ser Thr Val Glu
100 105 110

Gln Val Gly Ile Asn Trp Ala Tyr Ser Phe Val Lys Arg His Glu Ser
115 120 125

Leu Arg Thr Arg Phe Ala Arg Arg Leu Asn Tyr Gln Arg Ala Lys Met
130 135 140

Glu Asp Pro Glu Val Ile Lys Asp Trp Phe Lys Arg Val Gln Glu Val
145 150 155 160

Ile Gln Glu Tyr Gly Ile Ser Ser Asp Asp Ile Tyr Asn Phe Asp Glu
165 170 175

Thr Gly Phe Ala Met Gly Met Ile Ala Thr Tyr Lys Val Val Thr Ser
180 185 190

Ser Gln Arg Ala Gly Arg Pro Ser Leu Val Gln Pro Gly Asn Arg Glu
195 200 205

Trp Val Thr Ala Ile Glu Cys Ile Arg Ser Asn Gly Glu Val Leu Pro
210 215 220

Ser Thr Leu Ile Phe Lys Gly Lys Thr His Leu Lys Ala Trp Tyr Glu
225 230 235 240

Gly Gln Ser Ile Pro Pro Thr Trp Arg Phe Glu Val Ser Asp Asn Gly
245 250 255

Trp Thr Thr Asp Lys Ile Gly Leu Arg Trp Leu Gln Lys His Phe Ile
260 265 270

Pro Leu Ile Arg Gly Lys Ser Val Gly Lys Tyr Ser Leu Leu Val Leu
275 280 285

Asp Gly His Gly Ser His Leu Thr Pro Glu Phe Asp Gln Ser Cys Ala
290 295 300

Glu Asn Glu Val Ile Pro Ile Cys Met Pro Ala His Ser Ser His Leu
305 310 315 320

Leu Gln Pro Leu Asp Val Gly Cys Phe Ser Val Leu Lys Arg Thr Tyr
325 330 335

Gly Gly Met Val Gln Lys Gln Met Gln Tyr Gly Arg Asn His Ile Asp
340 345 350

Lys Leu Asp Phe Leu Glu Val Tyr Pro Lys Ala His Gln Cys Ala Leu
355 360 365

Ser Lys Ser Asn Ile Ile Ser Gly Phe Arg Ala Thr Gly Leu Val Pro
370 375 380

Leu Asp Pro Asp Gln Val Leu Ser Arg Leu His Ile Arg Leu Lys Thr
385 390 395 400

Pro Pro Thr Pro Asp Ser Gln Ser Ser Gly Ser Val Leu Gln Thr Pro
405 410 415

His Asn Ile Lys His Leu Leu Lys His Pro Lys Ser Val Glu Arg Leu
420 425 430

Leu Arg Lys Arg Gln Ala Ser Pro Thr Ser Pro Thr Asn Ser Thr Leu
435 440 445

Arg Gln Leu Leu Lys Gly Cys Glu Leu Ala Ile Thr Asn Ser Ile Ile
450 455 460

Leu Ala Lys Glu Asn Ala Glu Leu Arg Ala Ser His Glu Lys Gln Leu
465 470 475 480

Pro Lys Arg Lys Arg Ser Arg Lys Gln Val Ile Tyr Thr Glu Gly Thr
485 490 495

Thr Val Glu Glu Ala Gln Arg Ala Ile Gln Glu Val Glu Glu Val Gln
500 505 510

Asn Asp Glu Asp Ile Glu Val Glu Pro Gln Ser Gln Tyr Thr Glu Thr
515 520 525

Pro Ser Arg Ala Pro Pro Arg Cys Ser Asn Cys Phe Asn Ile Gly His
530 535 540

Arg Arg Thr Gln Cys Ser Lys Pro Pro Thr Asn
545 550 555
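
The correspondence between the nucleotide listing of SEQ ID NO: 13 and the amino acid listing of SEQ ID NO: 14 can be spot-checked by translating the start of the open reading frame. The following Python sketch is illustrative only and is not part of the original disclosure. It assumes the standard genetic code, the function name `translate` is hypothetical, and the input is the 33-base stretch of SEQ ID NO: 13 that begins with the ATG at positions 98-100 as printed in the listing.

```python
# Standard genetic code (an assumption), with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate an uppercase DNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Bases 98-130 of SEQ ID NO: 13, copied from the listing as printed.
orf_start = "ATGCCACCAAAAGCATCTATCCCATCAAAATCG"
print(translate(orf_start))  # MPPKASIPSKS
```

In one-letter code, MPPKASIPSKS is Met Pro Pro Lys Ala Ser Ile Pro Ser Lys Ser, which matches the first eleven residues listed for SEQ ID NO: 14.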

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
ACGTAATCAA CGGTCGGACG GGCCCCCCGG TCAGGCGGGC CATC 44

(2) INFORMATION FOR SEQ ID NO: 16: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GGATGGCCCG CCTGACCGTG TGGCCCGTTC GACCGTTGAT TACGT 45
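
Because SEQ ID NO: 16 is described as an inverted repeat of Tan1, its reverse complement should closely resemble the opposite end of the element. The following Python sketch is illustrative only and is not part of the original disclosure. It compares the reverse complement of SEQ ID NO: 16 with the first 45 bases of SEQ ID NO: 13; the helper names are hypothetical, and both sequences are copied from the listing as printed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# First 45 bases of SEQ ID NO: 13 and the 45 bases of SEQ ID NO: 16, as printed.
left_end = "ACGTAATCAACGGTCGGGCGGGCCACACGGTCAGGCGGGCCACCC"
right_ir = "GGATGGCCCGCCTGACCGTGTGGCCCGTTCGACCGTTGATTACGT"

diffs = mismatch_positions(left_end, reverse_complement(right_ir))
print(len(diffs), "mismatches at positions", diffs)
```

With the sequences as printed here, the comparison reports three mismatched positions out of 45 (positions 17, 18 and 43). Whether these reflect genuine imperfection in the repeat or transcription errors in the listing is not determined by this check.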

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
ACGTAATCGG TAAGCGAGTT GCCCGCGCAA GCGAGTTGCC CACC 44

What is Claimed

1. A transposable element isolated from Aspergillus niger var. awamori comprising a
DNA fragment of about 2.3 kb which comprises SEQ ID NO: 1. 

2. The transposable element of Claim 1 comprising the DNA sequence of SEQ ID 
NO: 13 or variations thereof. 

3. A fragment of the transposable element of Claim 1 comprising part or all of the DNA 
sequence selected from the group consisting of SEQ ID NOS:1 and 16 or variations 
thereof. 

4. An isolated transposase coded for by the transposable element of Claim 1.

5. The transposase of Claim 4 comprising the amino acid sequence of SEQ ID NO:14. 

6. A method of isolating a transposable element from a filamentous fungus, comprising 
the steps of: 

(a) hybridizing fungal DNA under low stringency conditions to a probe, wherein 
the probe comprises part or all of one of the DNA fragments of Claim 3; and 

(b) isolating fungal DNA which hybridizes to said probe. 



7. The method of Claim 6 wherein the probe comprises an imperfect direct repeat
within the DNA sequence selected from the group consisting of SEQ ID NOS:1 and
16.



8. A method of isolating a transposable element from a filamentous fungus genomic 
library, the method comprising probing said library with an ORF-specific probe and 
isolating DNA which hybridizes to said ORF-specific probe. 

9. A method of isolating a transposable element from a filamentous fungus, the method
comprising, 

(a) subjecting fungal DNA to polymerase chain reaction amplification using 
part or all of one of the DNA fragments of Claim 3 as a primer, thereby 
generating amplified DNA sequences; 

(b) isolating the amplified DNA sequences; and 

(c) optionally identifying said amplified DNA sequence. 



10. A transposable element isolatable by the method of Claim 6, 7, 8 or 9.

11. A method of isolating activation sequences comprising:

(a) inserting a marker gene within the inverted repeats of a transposable element of 
Claim 1 to form a modified marker gene having the structure IR-marker-IR; 

(b) inserting the modified marker gene into a DNA target; 

(c) selecting for expression of the modified marker; and 

(d) isolating DNA upstream of said modified marker gene in said DNA target, which 
DNA upstream of said modified marker gene comprises an activation sequence 
driving expression of said modified marker gene. 

12. A method for inactivating a gene in a host cell wherein said gene encodes a gene 
product, the method comprising: 

(a) transforming a host cell with a genetic element to create a transformed host cell, 
wherein the genetic element comprises DNA for the gene and a transposable element of 
Claim 1 inserted within the DNA; and 

(b) selecting for the transformed host cells which are deficient in the gene product. 

13. A method for activating a desired gene in a host cell, the method comprising: 

(a) inserting a regulatory gene within the inverted repeats of a transposable element 
of Claim 1 to form a modified regulatory gene having the structure IR-regulatory gene-IR; 

(b) inserting the modified regulatory gene in DNA comprising the desired gene to 
form a DNA construct containing the modified regulatory gene upstream of said desired 
gene; 

(c) transforming the host cell with the DNA construct; and 

(d) selecting for transformants having enhanced expression of said desired gene. 

14. A transposable element isolated from Aspergillus oryzae comprising a DNA 
fragment of at least about 1.2 kb. 

15. The transposable element of Claim 14 comprising an inverted repeat DNA 
sequence of SEQ ID NO: 17 or a variation thereof. 

16. A fragment of the transposable element of Claim 14 comprising part or all of the DNA
sequence of SEQ ID NO: 17 or a variation thereof.

17. A method of isolating a transposable element from a filamentous fungus, comprising
the steps of: 

(a) hybridizing fungal DNA under low stringency conditions to a probe, wherein 
the probe comprises part or all of one of the DNA fragments of Claim 16; 
and 

(b) isolating fungal DNA which hybridizes to said probe. 



18. A method of isolating activation sequences comprising: 

(a) inserting a marker gene within the inverted repeats of a transposable element of 
Claim 10 to form a modified marker gene having the structure IR-marker-IR; 

(b) inserting the modified marker gene into a DNA target; 

(c) selecting for expression of the modified marker; and 

(d) isolating DNA upstream of said modified marker gene in said DNA target, which 
DNA upstream of said modified marker gene comprises an activation sequence 
driving expression of said modified marker gene. 



19. A method for inactivating a gene in a host cell wherein said gene encodes a gene
product, the method comprising: 

(a) transforming a host cell with a genetic element to create a transformed host cell, 
wherein the genetic element comprises DNA for the gene and a transposable element of 
Claim 10 inserted within the DNA; and 

(b) selecting for the transformed host cells which are deficient in the gene product. 

20. A method for activating a desired gene in a host cell, the method comprising: 

(a) inserting a regulatory gene within the inverted repeats of a transposable element 
of Claim 10 to form a modified regulatory gene having the structure IR-regulatory gene-IR; 

(b) inserting the modified regulatory gene in DNA comprising the desired gene to 
form a DNA construct containing the modified regulatory gene upstream of said desired 
gene; 

(c) transforming the host cell with the DNA construct; and 

(d) selecting for transformants having enhanced expression of said desired
gene.
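
The amplification step of Claim 9 can be illustrated in silico. The following Python sketch is not part of the claims or of the original disclosure. It assumes that a single primer derived from an inverted repeat anneals at one end of an element and, as its reverse complement, at the other end, so that the region between the two sites is amplified. The function names, the toy primer and the toy template are all hypothetical, and the `max_mismatches` parameter is only a crude stand-in for reduced stringency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(template, primer, max_mismatches=0):
    """Return 0-based start indices where primer matches template within the mismatch limit."""
    n = len(primer)
    sites = []
    for i in range(len(template) - n + 1):
        window = template[i:i + n]
        if sum(a != b for a, b in zip(window, primer)) <= max_mismatches:
            sites.append(i)
    return sites


def single_primer_amplicons(template, primer, max_mismatches=0):
    """Return regions bounded by the primer and, downstream, its reverse complement."""
    n = len(primer)
    forward = find_sites(template, primer, max_mismatches)
    reverse = find_sites(template, reverse_complement(primer), max_mismatches)
    # Require the downstream site to start after the upstream site ends.
    return [template[i:j + n] for i in forward for j in reverse if j >= i + n]


# Hypothetical toy data: an 8-base "inverted repeat" flanking an 8-base insert.
toy_primer = "ACGTAATC"
toy_template = "TTT" + "ACGTAATC" + "GGGGCCCC" + "GATTACGT" + "AAA"
print(single_primer_amplicons(toy_template, toy_primer))
```

For the toy template, the only predicted product is the 24-base region running from the forward repeat through the reverse repeat. Real primer design would also need to consider annealing temperature and primer length, which this sketch ignores.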
tgtcgacggctctctggactggcccaatgatggcagatatcctacggagtgcga
aacctttaaggaaaaccaagta ACGTAATCAA CGGTCGAACG GGCCACACGG
TCAGGCGGGC CATC CTGAAA TCCCATATAA AAGATGTCTT GGGGATTCTA
TTATATATCA ACCAGTACTA CTTCTATGAA GCTCTAACTT TGTAGATAGT
TATATATATA AGAATAAGTA TTCCATGAAT TTTTCAGATT TTAGAATTTT
TACTTTGATA ATGAAACCAG ATTCTTATAT AAAACATATA AATACAGATA
TTGTAATATG ATAAGTCCAT AAGTAAAAGT ATATTCATTT TTAGAAGGTA
TATAGATATT ATTTATATTA TTTAAAATCT ATATAGAAGA AATCTAATTC
TTCTAGACCT GGATGGTAGA GATATATTAT GTTTAAAAAG ATATCTTTTG
TATAGTATTA CCA GATGGCC CGCCTGACCG TGTGGCCCGT CCGACCGTTG
ATTACGT tatgtctgcatggagggagctgataagctggtaagttaccttatcca
tccatgcatgcagtgccctga

FIG. 7

FIG. 9 (drawing not reproduced). Labels recoverable from the drawing: "Tan1, 2324 bp"; "EcoRI"; "ORF-probe"; "Vader-p."; "Vader, 437 bp".